Pollution Attestation

Introduction

The Pollution Attestation Schema extends the generic Attestation Schema to enable attestations related to data pollution. By referencing pollution event details such as the type of pollution (e.g., label noise, bias amplification, concept drift, distribution shift, etc.), the date, and severity, this schema ensures a structured way to document and track reported data pollution events.

Description

This schema includes:

Type: The attestation type, set to "Pollution".
Pollution Details: Provides details of the data pollution event, including pollution type, date, severity and observed impact.

Use Case

The Pollution Attestation Schema is used to:

Document Pollution Events: Attach details of pollution events to attestations for components.
Track Data Integrity Issues: Keep track of components affected by pollution data and trace the impact on downstream models or decisions.
Support Compliance & Fairness: Ensure that pollution components are monitored and addressed in line with ethical AI and regulatory standards.

This schema promotes transparency and accountability, ensuring that data pollution issues are communicated clearly across software supply chains. It also helps organizations trace how pollution data may have impacted AI model behavior and mitigate potential risks.

Schemas

yaml
json
markdown

$id: https://github.com/nqminds/Trusted-AI-BOM/blob/main/packages/schemas/src/taibom-schemas/66-pollution-attestation.v1.0.0.schema.yaml
$schema: https://json-schema.org/draft/2019-09/schema
title: Pollution Attestation
description: |
  This schema extends the generic Attestation Schema to define an attestation that a component is pollution
type: object
properties:
  component:
    type: object
    description: Component reference, including an ID and hash for the VC claim.
    properties:
      id:
        type: string
        description: The component ID (unique identifier) of the VC claim.
      hash:
        type: string
        description: Cryptographic hash (e.g., SHA-256) for verifying the integrity of the VC claim.
    required:
      - id
      - hash
  attestation:
    type: object
    properties:
      type:
        type: string
        enum:
          - pollution
        description: Type of attestation, set to "Pollution" for this schema.
      pollution:
        type: object
        description: Data pollution event that applies to the particular component.
        properties:
          date:
            type: string
            description: The date when the data pollution event was identified.
          type:
            type: string
            description: The type of data pollution (e.g., label noise, bias amplification, concept drift, distribution shift).
          severity:
            type: string
            description: Severity of the data pollution issue.
          description:
            type: string
            description: Detailed description of the data pollution event.
          observed_impact:
            type: string
            description: How the pollution manifests in predictions or decisions.
          detection_method:
            type: string
            description: How the issue was detected (e.g., dataset audit, fairness metric analysis).
        required:
          - date
          - type
    required:
      - type
      - pollution
required:
  - component
  - attestation

{
  "$id": "https://github.com/nqminds/Trusted-AI-BOM/blob/main/packages/schemas/src/taibom-schemas/66-pollution-attestation.v1.0.0.schema.yaml",
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "title": "Pollution Attestation",
  "description": "This schema extends the generic Attestation Schema to define an attestation that a component is pollution\n",
  "type": "object",
  "properties": {
    "component": {
      "type": "object",
      "description": "Component reference, including an ID and hash for the VC claim.",
      "properties": {
        "id": {
          "type": "string",
          "description": "The component ID (unique identifier) of the VC claim."
        },
        "hash": {
          "type": "string",
          "description": "Cryptographic hash (e.g., SHA-256) for verifying the integrity of the VC claim."
        }
      },
      "required": [
        "id",
        "hash"
      ]
    },
    "attestation": {
      "type": "object",
      "properties": {
        "type": {
          "type": "string",
          "enum": [
            "pollution"
          ],
          "description": "Type of attestation, set to \"Pollution\" for this schema."
        },
        "pollution": {
          "type": "object",
          "description": "Data pollution event that applies to the particular component.",
          "properties": {
            "date": {
              "type": "string",
              "description": "The date when the data pollution event was identified."
            },
            "type": {
              "type": "string",
              "description": "The type of data pollution (e.g., label noise, bias amplification, concept drift, distribution shift)."
            },
            "severity": {
              "type": "string",
              "description": "Severity of the data pollution issue."
            },
            "description": {
              "type": "string",
              "description": "Detailed description of the data pollution event."
            },
            "observed_impact": {
              "type": "string",
              "description": "How the pollution manifests in predictions or decisions."
            },
            "detection_method": {
              "type": "string",
              "description": "How the issue was detected (e.g., dataset audit, fairness metric analysis)."
            }
          },
          "required": [
            "date",
            "type"
          ]
        }
      },
      "required": [
        "type",
        "pollution"
      ]
    }
  },
  "required": [
    "component",
    "attestation"
  ]
}

Pollution Attestation

This schema extends the generic Attestation Schema to define an attestation that a component is pollution

The schema defines the following properties:

`component` (object, required)

Component reference, including an ID and hash for the VC claim.

Properties of the component object:

`id` (string, required)

The component ID (unique identifier) of the VC claim.

`hash` (string, required)

Cryptographic hash (e.g., SHA-256) for verifying the integrity of the VC claim.

`attestation` (object, required)

Properties of the attestation object:

`type` (string, enum, required)

Type of attestation, set to "Pollution" for this schema.

This element must be one of the following enum values:

pollution

`pollution` (object, required)

Data pollution event that applies to the particular component.

Properties of the pollution object:

`date` (string, required)

The date when the data pollution event was identified.

`type` (string, required)

The type of data pollution (e.g., label noise, bias amplification, concept drift, distribution shift).

`severity` (string)

Severity of the data pollution issue.

`description` (string)

Detailed description of the data pollution event.

`observed_impact` (string)

How the pollution manifests in predictions or decisions.

`detection_method` (string)

How the issue was detected (e.g., dataset audit, fairness metric analysis).

Examples

table
json

component	attestation
[object Object]	[object Object]

[
  {
    "component": {
      "id": "urn:uuid:123e4567-e89b-12d3-a456-426614174000",
      "hash": "f1e2d3c4b5a697887766554433221100"
    },
    "attestation": {
      "type": "pollution",
      "pollution": {
        "date": "2024-02-15",
        "type": "Bias Amplification",
        "severity": "medium",
        "description": "Training data contained an overrepresentation of negative sentiment from specific demographics, leading to bias sentiment analysis results.",
        "observed_impact": "Negative sentiment was disproportionately assigned to reviews from certain user groups.",
        "detection_method": "Fairness analysis detected a higher false-negative rate for positive reviews from affected groups."
      }
    }
  }
]

Edit this schema here

Introduction​

Description​

Use Case​

Schemas​

Pollution Attestation

component (object, required)​

id (string, required)​

hash (string, required)​

attestation (object, required)​

type (string, enum, required)​

pollution (object, required)​

date (string, required)​

type (string, required)​

severity (string)​

description (string)​

observed_impact (string)​

detection_method (string)​

Examples​

Introduction

Description

Use Case

Schemas

`component` (object, required)

`id` (string, required)

`hash` (string, required)

`attestation` (object, required)

`type` (string, enum, required)

`pollution` (object, required)

`date` (string, required)

`type` (string, required)

`severity` (string)

`description` (string)

`observed_impact` (string)

`detection_method` (string)

Examples